34 research outputs found

    Stack Overflow in Github: Any Snippets There?

    When programmers look for how to achieve certain programming tasks, Stack Overflow is a popular destination in search engine results. Over the years, Stack Overflow has accumulated an impressive knowledge base of amply documented code snippets. We are interested in studying how programmers use these snippets in their projects. Can we find Stack Overflow snippets in real projects? When snippets are used, is the copy literal or has it been adapted? And are these adaptations specializations required by the idiosyncrasies of the target artifact, or are they motivated by specific requirements of the programmer? The large-scale study presented in this paper analyzes 909k non-fork Python projects hosted on GitHub, which contain 290M function definitions, and 1.9M Python snippets captured from Stack Overflow. Results are presented as a quantitative analysis of block-level code cloning within and between Stack Overflow and GitHub, and as an analysis of programming behaviors through the qualitative analysis of our findings. Comment: 14th International Conference on Mining Software Repositories, 11 pages

    SourcererCC: Scaling Code Clone Detection to Big Code

    Despite a decade of active research, there is a marked lack of clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, and the number of token comparisons needed to judge a potential clone. We evaluate the scalability, execution time, recall and precision of SourcererCC, and compare it to four publicly available, state-of-the-art tools. To measure recall, we use two recent benchmarks: (1) a large benchmark of real clones, BigCloneBench, and (2) a Mutation/Injection-based framework of thousands of fine-grained artificial clones. We find SourcererCC has both high recall and precision, and is able to scale to a large inter-project repository (250MLOC) using a standard workstation. Comment: Accepted for publication at ICSE'16 (preprint, unrevised)
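
The idea of querying an inverted index for candidate clones can be sketched roughly as follows. This is a minimal illustration, not SourcererCC's implementation: the regex tokenizer, the function names, and the 0.8 overlap threshold are all assumptions made for the example, and the token-ordering filtering heuristics mentioned in the abstract are left out entirely.

```python
import re
from collections import Counter, defaultdict


def tokenize(code):
    """Split a code block into a bag (multiset) of identifier and number tokens."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+", code))


def overlap(a, b):
    """Number of tokens two bags share, counting multiplicity."""
    return sum((a & b).values())


def find_clones(blocks, threshold=0.8):
    """Return (id1, id2) pairs whose shared tokens cover at least `threshold`
    of the larger block. `blocks` maps a block id to its source text.
    The threshold value is an assumption chosen for illustration."""
    bags = {bid: tokenize(src) for bid, src in blocks.items()}
    index = defaultdict(set)  # inverted index: token -> ids of blocks seen so far
    clones = []
    for bid, bag in bags.items():
        # Only blocks that share at least one token become candidates,
        # so unrelated blocks are never compared.
        candidates = set()
        for token in bag:
            candidates |= index[token]
        for other in sorted(candidates):
            size = max(sum(bag.values()), sum(bags[other].values()))
            if size and overlap(bag, bags[other]) / size >= threshold:
                clones.append((other, bid))
        # Index the current block after querying, so each pair is checked once.
        for token in bag:
            index[token].add(bid)
    return clones


blocks = {
    "a": "def add(x, y): return x + y",
    "b": "def add(x, y): return y + x",
    "c": "print('hello world')",
}
print(find_clones(blocks))
```

Here blocks `a` and `b` contain the same tokens in a different order, so a bag-of-tokens comparison treats them as clones, while `c` shares no token with either and is never compared against them.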

    Towards Automating Precision Studies of Clone Detectors

    Current research in clone detection suffers from poor ecosystems for evaluating the precision of clone detection tools. Corpora of labeled clones are scarce and incomplete, making evaluation labor-intensive and idiosyncratic, and limiting inter-tool comparison. Precision-assessment tools are simply lacking. We present a semi-automated approach to facilitate precision studies of clone detection tools. The approach merges automatic mechanisms of clone classification with manual validation of clone pairs. We demonstrate that the proposed automatic approach has very high precision and significantly reduces the number of clone pairs that need human validation during precision experiments. Moreover, we aggregate the individual effort of multiple teams into a single evolving dataset of labeled clone pairs, creating an important asset for software clone research. Comment: Accepted to be published in the 41st ACM/IEEE International Conference on Software Engineering

    A Systematic Review and Comparative Meta-analysis of Non-destructive Fruit Maturity Detection Techniques

    The global fruit industry is growing rapidly due to increased awareness of the health benefits associated with fruit consumption. Fruit maturity detection plays a crucial role in fruit logistics and maintenance, enabling farmers and fruit industries to grade fruits and develop sustainable policies for enhanced profitability and service quality. Non-destructive fruit maturity detection methods have gained significant attention, especially with advancements in machine vision and spectroscopic techniques. This systematic review provides a concise overview of the techniques and algorithms used in fruit quality grading by farmers and industries. The study reviewed 63 full-text articles published between 2012 and 2023, along with their bibliometric analysis. Qualitative analysis revealed that researchers from various disciplines contributed to this field, with techniques falling into 3 categories: machine vision (mathematical modelling or deep learning), spectroscopy, and other miscellaneous approaches. There was a high level of diversity among these categories, as indicated by an I-square value of 88.37% in the heterogeneity analysis. Meta-analysis, using odds ratios as the effect measure, established the relationship between techniques and their accuracy. Machine vision showed a positive correlation with accuracy across different categories. Additionally, Egger's and Begg's tests were used to assess publication bias, and no strong evidence of its occurrence was found. This study offers valuable insights into the advantages and limitations of various fruit maturity detection techniques. The statistical and meta-analytical methods took key factors such as accuracy and sample size into account. These findings will aid in the development of effective strategies for fruit quality assessment.

    Plasmodium falciparum PhIL1-associated complex plays an essential role in merozoite reorientation and invasion of host erythrocytes.

    The human malaria parasite Plasmodium falciparum possesses unique gliding machinery, referred to as the glideosome, that powers its entry into the insect and vertebrate hosts. Several parasite proteins, including Photosensitized INA-labelled protein 1 (PhIL1), have been shown to associate with the glideosome machinery. Here we describe a novel PhIL1-associated protein complex that co-exists with the glideosome motor complex in the inner membrane complex (IMC) of the merozoite. Using an experimental genetics approach, we characterized the role(s) of three proteins associated with PhIL1: a glideosome-associated protein, PfGAPM2; an IMC structural protein, PfALV5; and an uncharacterized protein, referred to here as PfPhIP (PhIL1 Interacting Protein). Parasites lacking PfPhIP or PfGAPM2 were unable to invade host RBCs. Additionally, the downregulation of PfPhIP resulted in significant defects in merozoite segmentation. Furthermore, the PfPhIP- and PfGAPM2-depleted parasites showed abrogation of reorientation/gliding. However, initial attachment to host RBCs was not affected in these parasites. Together, the data presented here show that proteins of the PhIL1-associated complex play an important role in the orientation of P. falciparum merozoites following initial attachment, which is crucial for the formation of a tight junction and hence invasion of host erythrocytes.

    Comparative Study of RDBMS, NOSQL and Graph Databases

    The paper aims to analyze and compare various forms of databases, particularly Relational Database Management Systems (RDBMS), Not Only SQL (NoSQL) databases, and graph databases. Structured Query Language (SQL), a semi-declarative language, is used by applications to access the data held in relational database systems, whereas NoSQL databases are based on key-value pairs. Graph databases use graph structures to resolve queries and to represent and store data.
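
The three models the abstract contrasts can be illustrated with one small "who follows whom" dataset. This is only a sketch under stated assumptions: the table name, keys, and sample data are hypothetical, and a plain Python dict and an adjacency map stand in for a real key-value store and a real graph database.

```python
import sqlite3

# Relational model: rows in a table, queried declaratively with SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (follower TEXT, followee TEXT)")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol")],
)
sql_result = [
    row[0]
    for row in conn.execute(
        "SELECT followee FROM follows WHERE follower = ?", ("alice",)
    )
]

# Key-value model: each value is fetched by its key, with no query language.
# (A dict is a stand-in for a key-value store, not a real NoSQL system.)
kv_store = {"follows:alice": ["bob"], "follows:bob": ["carol"]}
kv_result = kv_store["follows:alice"]

# Graph model: nodes and edges, queried by traversing the structure.
graph = {"alice": {"bob"}, "bob": {"carol"}, "carol": set()}


def reachable(graph, start):
    """Return every node reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


print(sql_result, kv_result, sorted(reachable(graph, "alice")))
```

The relational and key-value lookups both answer "whom does alice follow directly?", while the traversal answers the multi-hop question "whom can alice reach?", the kind of query that graph structures make direct.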